Exploring the Metabolic and Genetic Control of
Gene Expression on a Genomic Scale 

Joseph L. DeRisi, Vishwanath R. Iyer, Patrick O. Brown*

DNA microarrays containing virtually every gene of Saccharomyces cerevisiae were used
to carry out a comprehensive investigation of the temporal program of gene expression
accompanying the metabolic shift from fermentation to respiration. The expression
profiles observed for genes with known metabolic functions pointed to features of the
metabolic reprogramming that occur during the diauxic shift, and the expression patterns
of many previously uncharacterized genes provided clues to their possible functions. The
same DNA microarrays were also used to identify genes whose expression was affected 
by deletion of the transcriptional co-repressor TUP1 or overexpression of the transcrip- 
tional activator YAP1. These results demonstrate the feasibility and utility of this ap- 
proach to genomewide exploration of gene expression patterns. 



The complete sequences of nearly a dozen
microbial genomes are known, and in the
next several years we expect to know the
complete genome sequences of several 
metazoans, including the human genome. 
Defining the role of each gene in these 
genomes will be a formidable task, and un- 
derstanding how the genome functions as a 
whole in the complex natural history of a 
living organism presents an even greater 
challenge. 

Knowing when and where a gene is 
expressed often provides a strong clue as to 
its biological role. Conversely, the pattern 
of genes expressed in a cell can provide 
detailed information about its state. Al- 
though regulation of protein abundance in 
a cell is by no means accomplished solely 
by regulation of mRNA, virtually all dif- 
ferences in cell type or state are correlated 
with changes in the mRNA levels of many 
genes. This is fortuitous because the only 
specific reagent required to measure the 
abundance of the mRNA for a specific 
gene is a cDNA sequence. DNA microar- 
rays, consisting of thousands of individual 
gene sequences printed in a high-density 
array on a glass microscope slide (1, 2),
provide a practical and economical tool 
for studying gene expression on a very 
large scale (3-6). 

Saccharomyces cerevisiae is an especially 

Department of Biochemistry, Stanford University School
of Medicine, Howard Hughes Medical Institute, Stanford,
CA 94305-5428, USA.

*To whom correspondence should be addressed.



favorable organism in which to conduct a 
systematic investigation of gene expression. 
The genes are easy to recognize in the genome
sequence, cis regulatory elements are
generally compact and close to the tran- 
scription units, much is already known 
about its genetic regulatory mechanisms, 
and a powerful set of tools is available for its 
analysis. 

A recurring cycle in the natural history 
of yeast involves a shift from anaerobic 
(fermentation) to aerobic (respiration) me- 
tabolism. Inoculation of yeast into a medi- 
um rich in sugar is followed by rapid growth 
fueled by fermentation, with the production 
of ethanol. When the fermentable sugar is 
exhausted, the yeast cells turn to ethanol as 
a carbon source for aerobic growth. This 
switch from anaerobic growth to aerobic 
respiration upon depletion of glucose, re- 
ferred to as the diauxic shift, is correlated 
with widespread changes in the expression 
of genes involved in fundamental cellular 
processes such as carbon metabolism, pro- 
tein synthesis, and carbohydrate storage 
(7). We used DNA microarrays to charac- 
terize the changes in gene expression that 
take place during this process for nearly the 
entire genome, and to investigate the ge- 
netic circuitry that regulates and executes 
this program. 

Yeast open reading frames (ORFs) were 
amplified by the polymerase chain reaction 
(PCR), with a commercially available set of
primer pairs (8). DNA microarrays, con- 
taining approximately 6400 distinct DNA 
sequences, were printed onto glass slides by 



using a simple robotic printing device (9). 
Cells from an exponentially growing culture 
of yeast were inoculated into fresh medium 
and grown at 30°C for 21 hours. After an 
initial 9 hours of growth, samples were har- 
vested at seven successive 2-hour intervals, 
and mRNA was isolated (10). Fluorescently
labeled cDNA was prepared by reverse transcription
in the presence of Cy3 (green)- or Cy5 (red)-labeled
deoxyuridine triphosphate (dUTP) (11) and then hybridized to
the microarrays (12). To maximize the reliability
with which changes in expression
levels could be discerned, we labeled cDNA 
prepared from cells at each successive time 
point with Cy5, then mixed it with a Cy3- 
labeled "reference" cDNA sample prepared 
from cells harvested at the first interval 
after inoculation. In this experimental de- 
sign, the relative fluorescence intensity 
measured for the Cy3 and Cy5 fluors at 
each array element provides a reliable measure
of the relative abundance of the corresponding
mRNA in the two cell populations
(Fig. 1). Data from the series of seven
samples (Fig. 2), consisting of more than 
43,000 expression-ratio measurements, 
were organized into a database to facilitate 
efficient exploration and analysis of the 
results. This database is publicly available 
on the Internet (13). 
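
The two-color design described above reduces each array element to a single number, the ratio of Cy5 (sample) to Cy3 (reference) fluorescence. The following is a minimal sketch of that calculation, not the authors' analysis code. The function names and the intensity floor (used here only to avoid division by zero on dim spots) are assumptions introduced for illustration.

```python
import math


def expression_ratio(cy5, cy3, floor=1.0):
    """Ratio of sample (Cy5) to reference (Cy3) intensity for one spot.

    `floor` is a hypothetical lower bound applied to both channels so that
    a dim or empty spot cannot produce a division by zero.
    """
    return max(cy5, floor) / max(cy3, floor)


def log2_ratio(cy5, cy3, floor=1.0):
    """Log2 of the expression ratio; induction and repression are symmetric."""
    return math.log2(expression_ratio(cy5, cy3, floor))


if __name__ == "__main__":
    # Made-up intensities: the first spot is brighter in the sample channel.
    for cy5, cy3 in [(800.0, 200.0), (200.0, 800.0), (500.0, 500.0)]:
        print(cy5, cy3, expression_ratio(cy5, cy3), log2_ratio(cy5, cy3))
```

Working on a log scale is a common convention because a fourfold induction and a fourfold repression then sit at equal distances from zero.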

During exponential growth in glucose- 
rich medium, the global pattern of gene 
expression was remarkably stable. Indeed, 
when gene expression patterns between the 
first two cell samples (harvested at a 2-hour 
interval) were compared, mRNA levels dif- 
fered by a factor of 2 or more for only 19 
genes (0.3%), and the largest of these differences
was only 2.7-fold (14). However, as
glucose was progressively depleted from the 
growth media during the course of the ex- 
periment, a marked change was seen in the 
global pattern of gene expression. mRNA
levels for approximately 710 genes were 
induced by a factor of at least 2, and the 
mRNA levels for approximately 1030 genes 
declined by a factor of at least 2. Messenger 
RNA levels for 183 genes increased by a 
factor of at least 4, and mRNA levels for 
203 genes diminished by a factor of at least 
4. About half of these differentially ex- 
pressed genes have no currently recognized 
function and are not yet named. Indeed, 
more than 400 of the differentially ex- 
pressed genes have no apparent homology 
to any gene whose function is known (15).
The responses of these previously uncharacterized
genes to the diauxic shift therefore
provide the first small clue to their possible
roles.
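
The counts above come from applying a fold-change threshold to the expression ratios. As a rough sketch (the function name and the toy data are hypothetical, and the paper does not state how ties at the threshold were handled), the tally can be written as:

```python
def count_changes(ratios, threshold=2.0):
    """Count genes induced or repressed by at least `threshold`-fold.

    `ratios` maps a gene name to its sample/reference expression ratio.
    A gene counts as induced when ratio >= threshold and as repressed
    when ratio <= 1/threshold (inclusive bounds are an assumption).
    """
    induced = sum(1 for r in ratios.values() if r >= threshold)
    repressed = sum(1 for r in ratios.values() if r <= 1.0 / threshold)
    return induced, repressed


if __name__ == "__main__":
    # Toy ratios, not data from the study.
    toy = {"geneA": 4.5, "geneB": 0.4, "geneC": 1.1, "geneD": 2.0, "geneE": 0.5}
    print(count_changes(toy, threshold=2.0))
    print(count_changes(toy, threshold=4.0))
```

Running the same function with thresholds of 2 and 4 mirrors the two tiers of counts reported in the text.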

The global view of changes in expression
of genes with known functions provides
a vivid picture of the way in which
the cell adapts to a changing environ- 
ment. Figure 3 shows a portion of the yeast 
metabolic pathways involved in carbon 
and energy metabolism. Mapping the 
changes we observed in the mRNAs en- 
coding each enzyme onto this framework 
allowed us to infer the redirection in the 
flow of metabolites through this system.
We observed large inductions of the genes 
coding for the enzymes aldehyde dehydrogenase
(ALD2) and acetyl-coenzyme
A (CoA) synthase (ACS1), which function
together to convert the products of
alcohol dehydrogenase into acetyl-CoA, 
which in turn is used to fuel the tricarbox- 
ylic acid (TCA) cycle and the glyoxylate 
cycle. The concomitant shutdown of tran- 
scription of the genes encoding pyruvate 
decarboxylase and induction of pyruvate 
carboxylase rechannels pyruvate away 
from acetaldehyde, and instead to oxalacetate,
where it can serve to supply the
TCA cycle and gluconeogenesis. Induc- 
tion of the pivotal genes PCK1, encoding 
phosphoenolpyruvate carboxykinase, and 
FBP1, encoding fructose 1,6-biphosphatase,
switches the directions of two key
irreversible steps in glycolysis, reversing 
the flow of metabolites along the revers- 
ible steps of the glycolytic pathway toward 
the essential biosynthetic precursor, glu- 
cose-6-phosphate. Induction of the genes 
coding for the trehalose synthase and gly- 
cogen synthase complexes promotes chan- 
neling of glucose-6-phosphate into these 
carbohydrate storage pathways. 

Just as the changes in expression of 
genes encoding pivotal enzymes can pro- 
vide insight into metabolic reprogram- 
ming, the behavior of large groups of func- 
tionally related genes can provide a broad 
view of the systematic way in which the 
yeast cell adapts to a changing environ- 
ment (Fig. 4). Several classes of genes, 
such as cytochrome c-related genes and 
those involved in the TCA/glyoxylate cycle
and carbohydrate storage, were coordinately
induced by glucose exhaustion. In
contrast, genes devoted to protein synthe- 
sis, including ribosomal proteins, tRNA 
synthetases, and translation elongation
and initiation factors, exhibited a coordi- 
nated decrease in expression. More than 
95% of ribosomal genes showed at least 
twofold decreases in expression during the 
diauxic shift (Fig. 4) (13). A noteworthy
and illuminating exception was that the 



genes encoding mitochondrial ribosomal
proteins were generally induced rather than
repressed after glucose limitation, highlighting
the requirement for mitochondrial
biogenesis (13). As more is learned about 
the functions of every gene in the yeast 
genome, the ability to gain insight into a 
cell's response to a changing environment 
through its global gene expression patterns 
will become increasingly powerful. 

Several distinct temporal patterns of ex- 
pression could be recognized, and sets of 
genes could be grouped on the basis of the 
similarities in their expression patterns. The 
characterized members of each of these 
groups also shared important similarities in 
their functions. Moreover, in most cases, 
common regulatory mechanisms could be 
inferred for sets of genes with similar expres- 
sion profiles. For example, seven genes 
showed a late induction profile, with mRNA 
levels increasing by more than ninefold at 



the last timepoint but less than threefold at
the preceding timepoint (Fig. 5B). All of 
these genes were known to be glucose-re- 
pressed, and five of the seven were previously 
noted to share a common upstream activat- 
ing sequence (UAS), the carbon source re- 
sponse element (CSRE) (16-20). A search 
in the promoter regions of the remaining two
genes, ACR1 and IDP2, revealed that
ACR1, a gene essential for ACS1 activity,
also possessed a consensus CSRE motif, but
interestingly, IDP2 did not. A search of the
entire yeast genome sequence for the con- 
sensus CSRE motif revealed only four addi- 
tional candidate genes, none of which 
showed a similar induction. 

Examples from additional groups of
genes that shared expression profiles are 
illustrated in Fig. 5, C through F. The 
sequences upstream of the named genes in 
Fig. 5C all contain stress response ele- 
ments (STRE), and with the exception 
of HSP42, have previously been shown to
be controlled at least in part by these
elements (21-24). Inspection of the se- 
quences upstream of HSP42 and the two 
uncharacterized genes shown in Fig. 5C, 
YKL026c, a hypothetical protein with 
similarity to glutathione peroxidase, and 
YGR043c, a putative transaldolase, revealed
that each of these genes also possesses
repeated upstream copies of the stress-responsive
CCCCT motif. Of the 13 additional
genes in the yeast genome that
shared this expression profile [including 
HSP30, ALD2, OM45, and 10 uncharac- 
terized ORFs (25)], nine contained one or 
more recognizable STRE sites in their up- 
stream regions. 

The heterotrimeric transcriptional activator
complex HAP2,3,4 has been shown
to be responsible for induction of several
genes important for respiration (26-28).
This complex binds a degenerate consensus 
sequence known as the CCAAT box (26). 
Computer analysis, using the consensus se- 
quence TNRYTGGB (29), has suggested 
that a large number of genes involved in 
respiration may be specific targets of 
HAP2,3,4 (30). Indeed, a putative
HAP2,3,4 binding site could be found in 
the sequences upstream of each of the seven 
cytochrome c-related genes that showed 
the greatest magnitude of induction (Fig. 
5D). Of 12 additional cytochrome c-related 
genes that were induced, HAP2,3,4 binding
sites were present in all but one. Signifi- 
cantly, we found that transcription of 
HAP4 itself was induced nearly ninefold 
concomitant with the diauxic shift. 
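
A degenerate consensus such as TNRYTGGB can be searched for by expanding each IUPAC nucleotide code into a character class. The sketch below illustrates the idea and is not the computer analysis cited in the text. The function names are hypothetical, and scanning both strands is an assumption, since the text does not say how strand was handled.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in consensus.upper())


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(upstream, consensus="TNRYTGGB"):
    """Return sorted (position, strand) pairs for every consensus match.

    Positions are 0-based offsets of the leftmost base of the site on the
    forward strand. A lookahead is used so overlapping sites are reported.
    """
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    seq = upstream.upper()
    width = len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    hits += [(len(seq) - m.start() - width, "-") for m in pattern.finditer(rc)]
    return sorted(hits)


if __name__ == "__main__":
    # Made-up upstream fragment containing one forward-strand match.
    print(find_sites("AAATCATTGGCAAA"))
```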

Control of ribosomal protein biogenesis 
is mainly exerted at the transcriptional 
level, through the presence of a common 
upstream-activating element (UASrpg)
that is recognized by the Rap1 DNA-binding
protein (31, 32). The expression profiles
of seven ribosomal proteins are shown
in Fig. 5F. A search of the sequences
upstream of all seven genes revealed consensus
Rap1-binding motifs (33). It has
been suggested that declining Rap1 levels
in the cell during starvation may be responsible
for the decline in ribosomal protein
gene expression (34). Indeed, we observed
that the abundance of RAP1
mRNA diminished by 4.4-fold, at about
the time of glucose exhaustion.

Of the 149 genes that encode known or 
putative transcription factors, only two, 
HAP4 and SIP4, were induced by a factor of
more than threefold at the diauxic shift.
SIP4 encodes a DNA-binding transcriptional
activator that has been shown to
interact with Snf1, the "master regulator" of
glucose repression (35). The eightfold induction
of SIP4 upon depletion of glucose
strongly suggests a role in the induction of 



downstream genes at the diauxic shift. 

Although most of the transcriptional 
responses that we observed were not previously
known, the responses of many
genes during the diauxic shift have been 
described. Comparison of the results we 
obtained by DNA microarray hybridiza- 
tion with previously reported results there- 
fore provided a strong test of the sensitiv- 
ity and accuracy of this approach. The 
expression patterns we observed for previ- 
ously characterized genes showed almost 
perfect concordance with previously pub- 
lished results (36). Moreover, the differ- 
ential expression measurements obtained 
by DNA microarray hybridization were re- 
producible in duplicate experiments. For 
example, the remarkable changes in gene 
expression between cells harvested imme- 
diately after inoculation and immediately 
after the diauxic shift (the first and sixth 
intervals in this time series) were mea- 
sured in duplicate, independent DNA mi- 
croarray hybridizations. The correlation 
coefficient for two complete sets of expres- 
sion ratio measurements was 0.87, and for 
more than 95% of the genes, the expres- 



sion ratios measured in these duplicate 
experiments differed by less than a factor 
of 2. However, in a few cases, there were
discrepancies between our results and pre- 
vious results, pointing to technical limita- 
tions that will need to be addressed as 
DNA microarray technology advances 
(37, 38). Despite the noted exceptions, 
the high concordance between the results 
we obtained in these experiments and 
those of previous studies provides confi- 
dence in the reliability and thoroughness 
of the survey. 
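
The two reproducibility figures quoted above, a correlation coefficient between duplicate hybridizations and the fraction of genes whose duplicate ratios agree within a factor of 2, can be computed as in the following sketch. The function names are hypothetical, and correlating log-transformed ratios is an assumption; the text does not state whether raw or log ratios were used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log_ratio_correlation(rep1, rep2):
    """Correlation of log2 expression ratios from two replicate arrays."""
    return pearson([math.log2(r) for r in rep1], [math.log2(r) for r in rep2])


def fraction_within(rep1, rep2, factor=2.0):
    """Fraction of genes whose replicate ratios differ by less than `factor`."""
    agree = sum(1 for a, b in zip(rep1, rep2) if max(a, b) / min(a, b) < factor)
    return agree / len(rep1)


if __name__ == "__main__":
    # Toy positive ratios for the same genes in two replicate hybridizations.
    rep1 = [1.0, 4.0, 0.5, 8.0]
    rep2 = [1.5, 1.0, 0.4, 6.0]
    print(log_ratio_correlation(rep1, rep2), fraction_within(rep1, rep2))
```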

The changes in gene expression during 
this diauxic shift are complex and involve 
integration of many kinds of information 
about the nutritional and metabolic state 
of the cell. The large number of genes 
whose expression is altered and the diver- 
sity of temporal expression profiles ob- 
served in this experiment highlight the 
challenge of understanding the underlying 
regulatory mechanisms. One approach to 
defining the contributions of individual 
regulatory genes to a complex program of 
this kind is to use DNA microarrays to 
identify genes whose expression is affected 



Fig. 2. The section of the array
indicated by the gray box
in Fig. 1 is shown for each of
the experiments described
here. Representative genes
are labeled. In each of the arrays
used to analyze gene
expression during the diauxic
shift, red spots represent
genes that were induced relative
to the initial timepoint,
and green spots represent
genes that were repressed
relative to the initial timepoint.
In the arrays used to analyze
the effects of the tup1Δ mutation
and YAP1 overexpression,
red spots represent
genes whose expression was
increased, and green spots
represent genes whose expression
was decreased by
the genetic modification. Note
that distinct sets of genes are
induced and repressed in the
different experiments. The
complete images of each of
these arrays can be viewed on
the Internet (13). Cell density
as measured by optical density
(OD) at 600 nm was used to
measure the growth of the
culture.
by mutations in each putative regulatory
gene. As a test of this strategy, we analyzed
the genomewide changes in gene expression
that result from deletion of the TUP1 gene.
Transcriptional repression of many genes by 
glucose requires the DNA-binding repressor 



Mig1 and is mediated by recruiting the transcriptional
co-repressors Tup1 and Cyc8/Ssn6 (39).
Tup1 has also been implicated in
repression of oxygen-regulated, mating-type-specific,
and DNA-damage-inducible genes
(40).
Fig. 3. Metabolic reprogramming inferred from global analysis of changes in gene expression. Only key
metabolic intermediates are identified. The yeast genes encoding the enzymes that catalyze each step
in this metabolic circuit are identified by name in the boxes. The genes encoding succinyl-CoA synthase
and glycogen-debranching enzyme have not been explicitly identified, but the ORFs YGR244 and
YPR184 show significant homology to known succinyl-CoA synthase and glycogen-debranching en- 
zymes, respectively, and are therefore included in the corresponding steps in this figure. Red boxes with 
white lettering identify genes whose expression increases in the diauxic shift. Green boxes with dark 
green lettering identify genes whose expression diminishes in the diauxic shift. The magnitude of 
induction or repression is indicated for these genes. For multimeric enzyme complexes, such as
succinate dehydrogenase, the indicated fold-induction represents an unweighted average of all the 
genes listed in the box. Black and white boxes indicate no significant differential expression (less than 
twofold). The direction of the arrows connecting reversible enzymatic steps indicates the direction of the
flow of metabolic intermediates, inferred from the gene expression pattern, after the diauxic shift. Arrows 
representing steps catalyzed by genes whose expression was strongly induced are highlighted in red. 
The broad gray arrows represent major increases in the flow of metabolites after the diauxic shift, 
inferred from the indicated changes in gene expression. 



Wild-type yeast cells and cells bearing
a deletion of the TUP1 gene (tup1Δ) were
grown in parallel cultures in rich medium
containing glucose as the carbon source.
Messenger RNA was isolated from exponentially
growing cells from the two populations
and used to prepare cDNA labeled
with Cy3 (green) and Cy5 (red),
respectively (11). The labeled probes were
mixed and simultaneously hybridized to
the microarray. Red spots on the microarray
therefore represented genes whose
transcription was induced in the tup1Δ
strain, and thus presumably repressed by
Tup1 (41). A representative section of the
microarray (Fig. 2, bottom middle panel)
illustrates that the genes whose expression
was affected by the tup1Δ mutation were,
in general, distinct from those induced
upon glucose exhaustion [complete images
of all the arrays shown in Fig. 2 are available
on the Internet (13)]. Nevertheless,
34 (10%) of the genes that were induced
by a factor of at least 2 after the diauxic
shift were similarly induced by deletion of
TUP1, suggesting that these genes may be
subject to TUP1-mediated repression by
glucose. For example, SUC2, the gene encoding
invertase, and all five hexose transporter
genes that were induced during the
course of the diauxic shift were similarly
induced, in duplicate experiments, by the
deletion of TUP1.

The set of genes affected by Tup1 in this
experiment also included α-glucosidases,
the mating-type-specific genes MFA1 and
MFA2, and the DNA damage-inducible
RNR2 and RNR4, as well as genes involved
in flocculation and many genes of unknown
function. The hybridization signal corresponding
to expression of TUP1 itself was
also severely reduced because of the (incomplete)
deletion of the transcription unit
in the tup1Δ strain, providing a positive
control in the experiment (42).

Many of the transcriptional targets of
Tup1 fell into sets of genes with related
biochemical functions. For instance, although
only about 3% of all yeast genes
appeared to be TUP1-repressed by a factor
of more than 2 in duplicate experiments
under these conditions, 6 of the 13 genes
that have been implicated in flocculation
(15) showed a reproducible increase in
expression of at least twofold when TUP1
was deleted. Another group of related
genes that appeared to be subject to TUP1
repression encodes the serine-rich cell
wall mannoproteins, such as Tip1 and
Tir1/Srp1, which are induced by cold
shock and other stresses (43), and similar,
serine-poor proteins, the seripauperins
(44). Messenger RNA levels for 23 of the
26 genes in this group were reproducibly
elevated by at least 2.5-fold in the tup1Δ
strain, and 18 of these genes were induced
by more than sevenfold when TUP1 was
deleted. In contrast, none of 83 genes that
could be classified as putative regulators of
the cell division cycle were induced more
than twofold by deletion of TUP1. Thus,
despite the diversity of the regulatory systems
that employ Tup1, most of the genes
that it regulates under these conditions
fall into a limited number of distinct functional
classes.

Because the microarray allows us to 
monitor expression of nearly every gene in 
yeast, we can, in principle, use this ap- 
proach to identify all the transcriptional 
targets of a regulatory protein like Tup1. It
is important to note, however, that in any 
single experiment of this kind we can only 
recognize those target genes that are nor- 
mally repressed (or induced) under the 
conditions of the experiment. For instance,
the experiment described here analyzed
a MATα strain in which MFA1
and MFA2, the genes encoding the a-factor
mating pheromone precursor, are
normally repressed. In the isogenic tup1Δ
strain, these genes were inappropriately
expressed, reflecting the role that Tup1
plays in their repression. Had we instead
carried out this experiment with a MATa
strain (in which expression of MFA1 and
MFA2 is not repressed), it would not have
been possible to conclude anything regarding
the role of Tup1 in the repression
of these genes. Conversely, we cannot distinguish
indirect effects of the chronic
absence of Tup1 in the mutant strain from
effects directly attributable to its partici- 
pation in repressing the transcription of a 
gene. 

Another simple route to modulating the
activity of a regulatory factor is to overexpress
the gene that encodes it. YAP1 encodes
a DNA-binding transcription factor
belonging to the bZIP class of DNA-binding
proteins. Overexpression of YAP1 in
yeast confers increased resistance to hydrogen
peroxide, o-phenanthroline, heavy
metals, and osmotic stress (45). We analyzed
differential gene expression between a
wild-type strain bearing a control plasmid
and a strain with a plasmid expressing YAP1
under the control of the strong GAL1-10
promoter, both grown in galactose (that is,
a condition that induces YAP1 overexpression).
Complementary DNA from the control
and YAP1-overexpressing strains, labeled
with Cy3 and Cy5, respectively, was
prepared from mRNA isolated from the two
strains and hybridized to the microarray.
Thus, red spots on the array represent genes
that were induced in the strain overexpressing
YAP1.

Of the 17 genes whose mRNA levels 
increased by more than threefold when 



YAP1 was overexpressed in this way, five
bear homology to aryl-alcohol oxidoreductases
(Fig. 2 and Table 1). An additional
four of the genes in this set also belong to
the general class of dehydrogenases/oxi- 
doreductases. Very little is known about 
the role of aryl-alcohol oxidoreductases in 
S. cerevisiae, but these enzymes have been 
isolated from ligninolytic fungi, in which 
they participate in coupled redox reactions,
oxidizing aromatic and aliphatic
unsaturated alcohols to aldehydes with the 
production of hydrogen peroxide (46, 47). 
The fact that a remarkable fraction of the 
targets identified in this experiment be- 
long to the same small, functional group of 
oxidoreductases suggests that these genes 



Fig. 4. Coordinated regulation
of functionally related
genes. The curves
represent the average induction
or repression ratios
for all the genes in
each indicated group.
The total number of
genes in each group was
as follows: ribosomal
proteins, 112; translation
elongation and initiation


might play an important protective role
during oxidative stress. Transcription of a
small number of genes was reduced in the
strain overexpressing Yap1. Interestingly,
many of these genes encode sugar permeases
or enzymes involved in inositol
metabolism.

We searched for Yap1-binding sites
(TTACTAA or TGACTAA) in the sequences
upstream of the target genes we
identified (48). About two-thirds of the
genes that were induced by more than
threefold upon Yap1 overexpression had
one or more binding sites within 600 bases
upstream of the start codon (Table 1), suggesting
that they are directly regulated by
Yap1. The absence of canonical Yap1-bind-
factors, 25; tRNA synthetases (excluding mitochondrial synthetases), 17; glycogen and trehalose synthesis
and degradation, 15; cytochrome c oxidase and reductase proteins, 19; and TCA- and glyoxylate-cycle
enzymes, 24.

Table 1. Genes induced by YAP1 overexpression. This list includes all the genes for which mRNA levels
increased by more than twofold upon YAP1 overexpression in both of two duplicate experiments, and
for which the average increase in mRNA level in the two experiments was greater than threefold (50).
Positions of the canonical Yap1 binding sites upstream of the start codon, when present, and the
average fold-increase in mRNA levels measured in the two experiments are indicated.
YHR179W

Description

Putative aryl-alcohol reductase
Similarity to bacterial csgA protein
Transcriptional activator involved in oxidative stress response
Homology to aryl-alcohol dehydrogenases
Putative glutathione transferase
Putative aryl-alcohol dehydrogenase (NADP+)
Putative aryl-alcohol reductase
Aminotriazole and 4-nitroquinoline resistance protein
Homology to benomyl/methotrexate resistance protein
Hypothetical protein
Putative aryl-alcohol dehydrogenase
NADPH dehydrogenase (old yellow enzyme), isoform 3
Homology to hypothetical proteins YCR102C and YNL134C
Homology to hypothetical protein YMR251w
NAD(P)H oxidoreductase (old yellow enzyme), isoform 1
Similarity to A. thaliana zeta-crystallin homolog
Malate dehydrogenase
ing sites upstream of the others may reflect
an ability of Yap1 to bind sites that differ
from the canonical binding sites, perhaps in
cooperation with other factors, or, less likely,
may represent an indirect effect of Yap1
overexpression, mediated by one or more
intermediary factors. Yap1 sites were found
only four times in the corresponding region
of an arbitrary set of 30 genes that were not
differentially regulated by Yap1.
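
The binding-site search described above amounts to scanning the 600 bases upstream of each start codon for either of the two canonical Yap1 heptamers. A minimal sketch follows, assuming the upstream sequence is supplied 5' to 3' and ends immediately before the start codon. The function names are hypothetical, and checking the reverse strand as well is an assumption not stated in the text.

```python
YAP1_SITES = ("TTACTAA", "TGACTAA")


def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def yap1_sites(upstream, window=600):
    """Locate canonical Yap1 sites in the last `window` bases of `upstream`.

    Returns sorted (position, strand, site) tuples. `position` is the
    offset of the leftmost base of the match relative to the start codon,
    so the base immediately upstream of the start codon is -1.
    """
    region = upstream.upper()[-window:]
    hits = []
    for site in YAP1_SITES:
        for motif, strand in ((site, "+"), (revcomp(site), "-")):
            start = region.find(motif)
            while start != -1:
                hits.append((start - len(region), strand, site))
                start = region.find(motif, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Made-up promoter fragment with one forward-strand site.
    print(yap1_sites("G" * 10 + "TTACTAA" + "C" * 5))
```

Applying the same function to a set of genes that do not respond to Yap1 gives the kind of background frequency the authors compare against.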

Use of a DNA microarray to characterize
the transcriptional consequences of
mutations affecting the activity of regulatory
molecules provides a simple and powerful
approach to dissection and characterization
of regulatory pathways and networks.
This strategy also has an important
practical application in drug screening.
Mutations in specific genes encoding candidate
drug targets can serve as surrogates
for the ideal chemical inhibitor or modu- 
lator of their activity. DNA microarrays 
can be used to define the resulting signa- 
ture pattern of alterations in gene expres- 
sion, and then subsequently used in an 
assay to screen for compounds that repro- 
duce the desired signature pattern. 

DNA microarrays provide a simple and 
economical way to explore gene expres- 
sion patterns on a genomic scale. The 
hurdles to extending this approach to any 
other organism are minor. The equipment 
Fig. 5. Distinct temporal patterns of induction or repression help to group genes that share regulatory
properties. (A) Temporal profile of the cell density, as measured by OD at 600 nm, and glucose
concentration. (B) Seven genes exhibited a strong induction (greater than ninefold) only at
the last timepoint (20.5 hours). With the exception of IDP2, each of these genes has a CSRE UAS. There
were no additional genes observed to match this profile. (C) Seven members of a class of genes marked
by early induction with a peak in mRNA levels at 18.5 hours. Each of these genes contains STRE motif
repeats in its upstream promoter region. (D) Cytochrome c oxidase and ubiquinol cytochrome c
reductase genes. Marked by an induction coincident with the diauxic shift, each of these genes contains
a consensus binding motif for the HAP2,3,4 protein complex. At least 17 genes shared a similar
expression profile. (E) SAM1, GPP1, and several genes of unknown function are repressed before the
diauxic shift, and continue to be repressed upon entry into stationary phase. (F) Ribosomal protein
genes comprise a large class of genes that are repressed upon depletion of glucose. Each of the genes
profiled here contains one or more RAP1-binding motifs upstream of its promoter. RAP1 is a transcriptional
regulator of most ribosomal proteins.






required for fabricating and using DNA 
microarrays (9) consists of components 
that were chosen for their modest cost and 
simplicity. It was feasible for a small group 
to accomplish the amplification of more 
than 6000 genes in about 4 months and, 
once the amplified gene sequences were in 
hand, only 2 days were required to print a 
set of 110 microarrays of 6400 elements 
each. Probe preparation, hybridization, 
and fluorescent imaging are also simple 
procedures. Even conceptually simple ex- 
periments, as we described here, can yield 
vast amounts of information. The value of 
the information from each experiment of 
this kind will progressively increase as 
more is learned about the functions of
each gene and as additional experiments 
define the global changes in gene expres- 
sion in diverse other natural processes and 
genetic perturbations. Perhaps the greatest 
challenge now is to develop efficient 
methods for organizing, distributing, interpreting,
and extracting insights from the
large volumes of data these experiments 
will provide. 

REFERENCES AND NOTES 

1. M. Schena, D. Shalon, R. W. Davis, P. O. Brown,
Science 270, 467 (1995).

2. D. Shalon, S. J. Smith, P. O. Brown, Genome Res. 6,
639 (1996).

3. D. Lashkari, Proc. Natl. Acad. Sci. U.S.A., in press.

4. J. DeRisi et al., Nature Genet. 14, 457 (1996).

5. D. J. Lockhart et al., Nature Biotechnol. 14, 1676
(1996).

6. M. Chee et al., Science 274, 610 (1996).

7. M. Johnston and M. Carlson, in The Molecular Biology
of the Yeast Saccharomyces: Gene Expression,
E. W. Jones, J. R. Pringle, J. R. Broach, Eds. (Cold
Spring Harbor Laboratory Press, Cold Spring Harbor,
NY, 1992), p. 193.

8. Primers for each known or predicted protein coding 
sequence were supplied by Research Genetics. 
PCR was performed with the protocol supplied by 
Research Genetics, using genomic DNA from yeast 
strain S288C as a template. Each PCR product was 
verified by agarose gel electrophoresis and was 
deemed correct if the lane contained a single band of 
appropriate mobility. Failures were marked as such 
in the database. The overall success rate for a 
single-pass amplification of 6116 ORFs was ~94.5%. 

9. Glass slides (Gold Seal) were cleaned for 2 hours in a 
solution of 2 N NaOH and 70% ethanol. After rinsing 
in distilled water, the slides were then treated with a 
1:5 dilution of poly-L-lysine adhesive solution (Sigma) 
for 1 hour, and then dried for 5 min at 40°C in a 
vacuum oven. DNA samples from 100-µl PCR reactions 
were purified by ethanol purification in 96-well 
microtiter plates. The resulting precipitates were 
resuspended in 3x standard saline citrate (SSC) and 
transferred to new plates for arraying. A custom-built 
arraying robot was used to print on a batch of 110 
slides. Details of the design of the microarrayer are 
available at cmgm.stanford.edu/pbrown. After printing, 
the microarrays were rehydrated for 30 s in a 
humid chamber and then snap-dried for 2 s on a hot 
plate (100°C). The DNA was then ultraviolet (UV)-
crosslinked to the surface by subjecting the slides to 
60 mJ of energy (Stratagene Stratalinker). The rest of 
the poly-L-lysine surface was blocked by a 15-min 
incubation in a solution of 70 mM succinic anhydride 
dissolved in a solution consisting of 315 ml of 
1-methyl-2-pyrrolidinone (Aldrich) and 35 ml of 1 M 
boric acid (pH 8.0). Directly after the blocking 
reaction, the bound DNA was denatured by a 2-min 
incubation in distilled water at ~95°C. The slides were 
then transferred into a bath of 100% ethanol at room 
temperature, rinsed, and then spun dry in a clinical 
centrifuge. Slides were stored in a closed box at 
room temperature until used. 
10. YPD medium (8 liters), in a 10-liter fermentation 
vessel, was inoculated with 2 ml of a fresh overnight 
culture of yeast strain DBY7286 (MATa, ura3, 
GAL2). The fermentor was maintained at 30°C with 
constant agitation and aeration. The glucose content 
of the media was measured with a UV test kit 
(Boehringer Mannheim, catalog number 716251). 
Cell density was measured by OD at 600-nm 
wavelength. Aliquots of culture were rapidly withdrawn 
from the fermentation vessel by peristaltic pump, 
spun down at room temperature, and then flash 
frozen with liquid nitrogen. Frozen cells were stored 
at -80°C. 

11. Cy3-dUTP or Cy5-dUTP (Amersham) was incorporated 
during reverse transcription of 1.25 µg of 
polyadenylated [poly(A)+] RNA, primed by a dT(16) 
oligomer. This mixture was heated to 70°C for 10 
min, and then transferred to ice. A premixed solution, 
consisting of 200 U Superscript II (Gibco), 
buffer, deoxyribonucleoside triphosphates, and 
fluorescent nucleotides, was added to the RNA. 
Nucleotides were used at these final concentrations: 
500 µM for dATP, dCTP, and dGTP and 200 µM 
for dTTP. Cy3-dUTP and Cy5-dUTP were used at 
a final concentration of 100 µM. The reaction was 
then incubated at 42°C for 2 hours. Unincorporated 
fluorescent nucleotides were removed by first 
diluting the reaction mixture with 470 µl of 10 
mM tris-HCl (pH 8.0)/1 mM EDTA and then 
concentrating the mix to ~5 µl, using Centricon-30 
microconcentrators (Amicon). 
12. Purified, labeled cDNA was resuspended in 11 µl of 
3.5x SSC containing 10 µg of poly(dA) and 0.3 µl of 
10% SDS. Before hybridization, the solution was 
boiled for 2 min and then allowed to cool to room 
temperature. The solution was applied to the 
microarray under a cover slip, and the slide was 
placed in a custom hybridization chamber, which 
was subsequently incubated for ~8 to 12 hours in 
a water bath at 62°C. Before scanning, slides were 
washed in 2x SSC, 0.2% SDS for 5 min, and then 
0.05x SSC for 1 min. Slides were dried before 
scanning by centrifugation at 500 rpm in a Beckman 
CS-6R centrifuge. 

13. The complete data set is available on the Internet at 
cmgm.stanford.edu/pbrown/exp… 

14. For 95% of all the genes analyzed, the mRNA levels 
measured in cells harvested at the first and second 
interval after inoculation differed by a factor of less 
than 1.5. The correlation coefficient for the comparison 
between mRNA levels measured for each gene 
in these two different mRNA samples was 0.98. 
When duplicate mRNA preparations from the same 
cell sample were compared in the same way, the 
correlation coefficient between the expression levels 
measured for the two samples by comparative 
hybridization was 0.99. 

15. The numbers and identities of known and putative 
genes, and their homologies to other genes, were 
gathered from the following public databases: 
Saccharomyces Genome Database (genome-www.
stanford.edu), Yeast Protein Database (quest7.
proteome.com), and Munich Information Centre for 
Protein Sequences (speedy.mips.biochem.mpg.de/
mips/yeast/index.htmlx). 

16. A. Scholer and H. J. Schuller, Mol. Cell. Biol. 14, 
3613 (1994). 

17. S. Kratzer and H. J. Schuller, Gene 161, 75 (1995). 

18. R. J. Haselbeck and H. L. McAlister-Henn, J. Biol. Chem. 
268, 12116 (1993). 

19. M. Fernandez, E. Fernandez, R. Rodicio, Mol. Gen. 
Genet. 242, 727 (1994). 

20. A. Hartig et al., Nucleic Acids Res. 20, 5677 (1992). 

21. P. M. Martinez et al., EMBO J. 15, 2227 (1996). 

22. J. C. Varela, U. M. Praekelt, P. A. Meacock, R. J. 
Planta, W. H. Mager, Mol. Cell. Biol. 15, 6232 (1995). 

23. H. Ruis and C. Schuller, Bioessays 17, 959 (1995). 

24. J. L. Parrou, M. A. Teste, J. Francois, Microbiology 
143, 1891 (1997). 



25. This expression profile was defined as having an 
induction of greater than 10-fold at 18.5 hours and 
less than 11-fold at 20.5 hours. 

26. S. L. Forsburg and L. Guarente, Genes Dev. 3, 1166 
(1989). 

27. J. T. Olesen and L. Guarente, ibid. 4, 1714 (1990). 

28. M. Rosenkrantz, C. S. Kell, E. A. Pennell, L. J. 
Devenish, Mol. Microbiol. 13, 119 (1994). 

29. Single-letter abbreviations for the amino acid 
residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, 
Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, 
Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, 
Trp; and Y, Tyr. The nucleotide codes are as follows: 
B-C, G, or T; N-G, A, T, or C; R-A or G; and Y-C or 
T. 

30. C. Fondrat and A. Kalogeropoulos, Comput. Appl. 
Biosci. 12, 363 (1996). 

31. D. Shore, Trends Genet. 10, 408 (1994). 

32. R. J. Planta and H. A. Raue, ibid. 4, 64 (1988). 

33. The degenerate consensus sequence VYCYRNNCMNH 
was used to search for potential RAP1-binding 
sites. The exact consensus, as defined by (30), is 
WACAYCCRTACATYW, with up to three differences 
allowed. 

34. S. F. Neuman, S. Bhattacharya, J. R. Broach, Mol. 
Cell. Biol. 15, 3167 (1995). 

35. P. Lesage, X. Yang, M. Carlson, ibid. 16, 1921 
(1996). 

36. For example, we observed large inductions of the 
genes coding for PCK1, FBP1 [Z. Yin et al., Mol. 
Microbiol. 20, 751 (1996)], the central glyoxylate 
cycle gene ICL1 [A. Scholer and H. J. Schuller, 
Curr. Genet. 23, 375 (1993)], and the "aerobic" 
isoform of acetyl-CoA synthase, ACS1 [M. A. van 
den Berg et al., J. Biol. Chem. 271, 28953 (1996)], 
with concomitant down-regulation of the 
glycolytic-specific genes PYK1 and PFK2 [P. A. Moore et 
al., Mol. Cell. Biol. 11, 5330 (1991)]. Other genes 
not directly involved in carbon metabolism but 
known to be induced upon nutrient limitation 
include genes encoding cytosolic catalase T, CTT1 
[P. H. Bissinger et al., ibid. 9, 1309 (1989)], and 
several genes encoding small heat-shock proteins 
such as HSP12, HSP26, and HSP42 [I. Farkas et 
al., J. Biol. Chem. 266, 15602 (1991); U. M. 
Praekelt and P. A. Meacock, Mol. Gen. Genet. 223, 
97 (1990); D. Wotton et al., J. Biol. Chem. 271, 
2717 (1996)]. 

37. The levels of induction we measured for genes that 
were expressed at very low levels in the uninduced 
state (notably, FBP1 and PCK1) were generally lower 
than those previously reported. The discrepancy 
was likely due to the conservative background 
subtraction method we used, which generally resulted in 
overestimation of very low expression levels (46). 

38. Cross-hybridization of highly related sequences can 
also occasionally obscure changes in gene expression, 
an important concern where members of gene 
families are functionally specialized and differentially 
regulated. The major alcohol dehydrogenase genes, 
ADH1 and ADH2, share 88% nucleotide identity. 
Reciprocal regulation of these genes is an important 
feature of the diauxic shift, but was not observed in 
this experiment, presumably because of 
cross-hybridization of the fluorescent cDNAs representing 
these two genes. Nevertheless, we were able to 
detect differential expression of closely related isoforms 
of other enzymes, such as HXK1/HXK2 (77% 
identical) [P. Herrero et al., Yeast 11, 137 (1995)], MLS1/
DAL7 (73% identical) (20), and PGM1/PGM2 (72% 
identical) [D. Oh and J. E. Hopper, Mol. Cell. Biol. 10, 
1415 (1990)], in accord with previous studies. Use in 
the microarray of deliberately selected DNA 
sequences corresponding to the most divergent 
segments of homologous genes, in lieu of the complete 
gene sequences, should relieve this problem in many 
cases. 



39. F. E. Williams, U. Varanasi, R. J. Trumbly, Mol. Cell. 
Biol. 11, 3307 (1991). 

40. D. Tzamarias and K. Struhl, Nature 369, 758 (1994). 

41. Differences in mRNA levels between the tup1Δ and 
wild-type strain were measured in two independent 
experiments. The correlation coefficient between the 
complete sets of expression ratios measured in 
these duplicate experiments was 0.83. The 
concordance between the sets of genes that appeared to 
be induced was very high between the two 
experiments. When only the 355 genes that showed at 
least a twofold increase in mRNA in the tup1Δ strain 
in either of the duplicate experiments were 
compared, the correlation coefficient was 0.82. 

42. The tup1Δ mutation consists of an insertion of the 
LEU2 coding sequence, including a stop codon, 
between the ATG of TUP1 and an Eco RI site 124 base 
pairs before the stop codon of the TUP1 gene. 

43. L. R. Kowalski, K. Kondo, M. Inouye, Mol. Microbiol. 
15, 341 (1995). 

44. M. Viswanathan, G. Muthukumar, Y. S. Cong, J. 
Lenard, Gene 148, 149 (1994). 

45. D. Hirata, K. Yano, T. Miyakawa, Mol. Gen. Genet. 
242, 250 (1994). 

46. A. Gutierrez, L. Caramelo, A. Prieto, M. J. Martinez, 
A. T. Martinez, Appl. Environ. Microbiol. 60, 1783 
(1994). 

47. A. Muheim et al., Eur. J. Biochem. 195, 369 (1991). 

48. J. A. Wemmie, M. S. Szczypka, D. J. Thiele, W. S. 
Moye-Rowley, J. Biol. Chem. 269, 32592 (1994). 

49. Microarrays were scanned using a custom-built 
scanning laser microscope built by S. Smith with 
software written by N. Ziv. Details concerning scanner 
design and construction are available at cmgm.
stanford.edu/pbrown. Images were scanned at a 
resolution of 20 µm per pixel. A separate scan, using 
the appropriate excitation line, was done for each of 
the two fluorophores used. During the scanning 
process, the ratio between the signals in the two 
channels was calculated for several array elements 
containing total genomic DNA. To normalize the two 
channels with respect to overall intensity, we then 
adjusted photomultiplier and laser power settings 
such that the signal ratio at these elements was as 
close to 1.0 as possible. The combined images were 
analyzed with custom-written software. A bounding 
box, fitted to the size of the DNA spots in each 
quadrant, was placed over each array element. The 
average fluorescent intensity was calculated by 
summing the intensities of each pixel present in a 
bounding box, and then dividing by the total number of 
pixels. Local area background was calculated for 
each array element by determining the average 
fluorescent intensity for the lower 20% of pixel 
intensities. Although this method tends to underestimate 
the background, causing an underestimation of 
extreme ratios, it produces a very consistent and 
noise-tolerant approximation. Although the 
analog-to-digital board used for data collection possesses a 
wide dynamic range (12 bits), several signals were 
saturated (greater than the maximum signal intensity 
allowed) at the chosen settings. Therefore, extreme 
ratios at bright elements are generally underestimated. 
A signal was deemed significant if the average 
intensity after background subtraction was at least 
2.5-fold higher than the standard deviation in the 
background measurements for all elements on the 
array. 

50. In addition to the 17 genes shown in Table 1, three 
additional genes were induced by an average of 
more than threefold in the duplicate experiments, but 
in one of the two experiments, the induction was less 
than twofold (range 1.6- to 1.9-fold). 

51. We thank H. Bennett, P. Spellman, J. Ravetto, M. 
Eisen, R. Pillai, B. Dunn, T. Ferea, and other members 
of the Brown lab for their assistance and helpful 
advice. We also thank S. Friend, D. Botstein, S. 
Smith, J. Hudson, and D. Dolginow for advice, 
support, and encouragement; K. Struhl and S. Chatterjee 
for the Tup1 deletion strain; L. Fernandes for 
helpful advice on Yap1; and S. Klapholz and the 
reviewers for many helpful comments on the 
manuscript. Supported by a grant from the National 
Human Genome Research Institute (NHGRI) 
(HG00450), and by the Howard Hughes Medical 
Institute (HHMI). J.D.R. was supported by the HHMI 
and the NHGRI. V.R.I. was supported in part by an 
Institutional Training Grant in Genome Science (T32 
HG00044) from the NHGRI. P.O.B. is an associate 
investigator of the HHMI. 

5 September 1997; accepted 22 September 1997 
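
Note 14 summarizes reproducibility with two statistics: the share of genes whose measurements differ by less than 1.5-fold, and a correlation coefficient. The sketch below shows one way such a check could be computed. It assumes a Pearson correlation on untransformed intensities, which the note does not specify, and the function names and sample values are hypothetical rather than taken from the data set.

```python
import math

def fold_difference(a, b):
    """Symmetric fold difference between two positive measurements (always >= 1)."""
    return max(a, b) / min(a, b)

def fraction_within_fold(xs, ys, limit=1.5):
    """Fraction of paired measurements that differ by less than `limit`-fold."""
    within = sum(1 for x, y in zip(xs, ys) if fold_difference(x, y) < limit)
    return within / len(xs)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical signal intensities for five genes in two mRNA samples
# (illustrative numbers only, not values from the paper).
sample_1 = [120.0, 450.0, 80.0, 1000.0, 300.0]
sample_2 = [130.0, 400.0, 150.0, 950.0, 310.0]

print(fraction_within_fold(sample_1, sample_2))  # share of genes under 1.5-fold
print(round(pearson(sample_1, sample_2), 3))     # agreement between samples
```

In this toy input, four of the five genes fall under the 1.5-fold limit; the third gene (80 versus 150) does not.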

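
The expression profile in note 25 is defined by two thresholds on fold induction. As a rough illustration, the filter below applies those two thresholds to per-gene fold values keyed by hours after inoculation. The gene names, the dictionary layout, and the fold values are hypothetical; only the two cutoffs come from the note.

```python
def matches_profile(fold_by_hour, min_at_18_5=10.0, max_at_20_5=11.0):
    """True if induction is > 10-fold at 18.5 h and < 11-fold at 20.5 h (note 25)."""
    return fold_by_hour[18.5] > min_at_18_5 and fold_by_hour[20.5] < max_at_20_5

# Hypothetical fold-induction values keyed by hours after inoculation.
profiles = {
    "geneA": {18.5: 12.0, 20.5: 6.0},   # peaks at 18.5 h, then declines
    "geneB": {18.5: 4.0, 20.5: 3.0},    # never reaches 10-fold
    "geneC": {18.5: 15.0, 20.5: 14.0},  # still above 11-fold at 20.5 h
}

selected = sorted(name for name, folds in profiles.items() if matches_profile(folds))
print(selected)
```

Only the gene that exceeds 10-fold at 18.5 hours and has dropped below 11-fold by 20.5 hours is retained.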

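
Note 33 describes two motif searches: a degenerate consensus matched exactly, and a longer consensus matched with up to three differences. A minimal sketch of such a search follows, using the standard IUPAC nucleotide ambiguity codes. It scans only the strand it is given and counts substitutions only; whether the original search also covered the reverse strand is not stated, so that is left out here. The function names and test sequences are hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def count_mismatches(window, motif):
    """Number of positions where the base is not permitted by the motif code."""
    return sum(1 for base, code in zip(window, motif) if base not in IUPAC[code])

def find_sites(sequence, motif, max_mismatches=0):
    """Return (start, matched_text, mismatches) for every qualifying window."""
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        mismatches = count_mismatches(window, motif)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits

# Hypothetical upstream sequence containing one exact match to the degenerate motif.
upstream = "GGGACCTGAACATTGGG"
print(find_sites(upstream, "VYCYRNNCMNH"))

# Hypothetical site with two substitutions relative to the exact consensus.
print(find_sites("AGCACCCAGACATCA", "WACAYCCRTACATYW", max_mismatches=3))
```

Sliding a fixed-length window keeps the logic easy to audit; for whole-genome scans a compiled regular expression or a dedicated motif tool would be faster.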

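
The quantification steps in note 49 (mean intensity per bounding box, background from the lower 20% of pixel intensities, and a 2.5-fold significance cutoff) can be sketched as follows. Several details are assumptions: the background is taken from the pixels of the same bounding box, the "standard deviation in the background measurements" is read as the population standard deviation of the per-element background values, and all pixel values are invented for illustration.

```python
import statistics

def spot_mean(pixels):
    """Average intensity: sum of pixel intensities divided by the pixel count."""
    return sum(pixels) / len(pixels)

def local_background(pixels, fraction=0.20):
    """Average intensity of the dimmest `fraction` of pixels (at least one pixel)."""
    ordered = sorted(pixels)
    count = max(1, int(len(ordered) * fraction))
    return sum(ordered[:count]) / count

def quantify(spots, threshold=2.5):
    """Return (net_intensity, is_significant) for each array element."""
    backgrounds = [local_background(pixels) for pixels in spots]
    # Assumption: spread of the per-element background values across the array.
    background_sd = statistics.pstdev(backgrounds)
    results = []
    for pixels, background in zip(spots, backgrounds):
        net = spot_mean(pixels) - background
        results.append((net, net >= threshold * background_sd))
    return results

# Hypothetical pixel intensities for three array elements (10 pixels each).
spots = [
    [10, 10, 12, 12, 50, 60, 70, 80, 90, 100],          # moderate signal
    [8, 12, 11, 10, 9, 10, 11, 10, 9, 10],              # essentially background
    [12, 12, 150, 160, 170, 180, 190, 200, 210, 220],   # bright signal
]

for net, significant in quantify(spots):
    print(round(net, 2), significant)
```

Because the dimmest pixels define the background, the estimate is insensitive to a few bright outliers, which is consistent with the note's remark that the method is consistent but tends to underestimate the background.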



SCIENCE • VOL 278 • 24 OCTOBER 1997 • www.sciencemag.org 



